BUG: avoid float upcast when mixing signed/unsigned ints in isin #62608

divya1974 · 2025-10-07T06:00:42Z

closes BUG: Implicit conversion to float64 with isin() #61676
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Summary
Fix incorrect Series.isin results when signed int64 values are compared with unequal uint64 values. Previously, mixing signed and unsigned 64-bit integers could trigger a common-type coercion to float64, which can lose precision and produce false positives. This change prevents that unsafe upcast by preferring an object-based comparison when signed and unsigned integer types are mixed.

Root cause
When isin attempted to find a common numeric dtype between comps (left side) and values (right side), mixing signed int64 with uint64 could lead to casting both sides to float64. Converting large 64-bit integers to float64 loses precision and can make two distinct integers compare as equal.
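The precision loss described above can be reproduced with plain Python numbers. This is a minimal sketch: the two values are illustrative boundary values chosen here, not the ones from the linked issue, and Python's built-in float (an IEEE 754 double) stands in for NumPy's float64, which is the common type NumPy picks for int64 and uint64.

```python
# Illustrative values (an assumption, not taken from the linked issue):
# the largest int64 and the uint64 value one above it.
signed_value = 2**63 - 1      # fits in int64
unsigned_value = 2**63        # fits only in uint64

# As exact integers the two values differ.
exact_equal = signed_value == unsigned_value

# A double has a 53-bit significand, so both values round to the same float.
float_equal = float(signed_value) == float(unsigned_value)

print(exact_equal)  # False
print(float_equal)  # True
```

Any membership test carried out after such a cast would report the first value as present among the second, which is the false positive this PR targets.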

What I changed
[algorithms.py]:
Adjusted the condition used before converting values to an object array so that, when the dtypes differ and either side is an unsigned integer, [values] is converted to object (i.e., Python-level equality / hashtable lookup) instead of going through numeric coercion. This makes the mixed signed/unsigned decision symmetric and avoids unsafe float upcasts.
[test_isin.py]:
Added test_isin_int64_vs_uint64_mismatch, which reproduces the reported case and asserts the correct False result.
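To illustrate why an object-based comparison gives the right answer, here is a rough sketch using plain Python integers. Both helpers (isin_exact and isin_via_float) are hypothetical names invented for this example and are not part of pandas; a Python set stands in for the hashtable lookup, and the input values are illustrative.

```python
def isin_exact(comps, values):
    """Hypothetical helper: exact membership using Python ints in a set."""
    lookup = {int(v) for v in values}
    return [int(c) in lookup for c in comps]


def isin_via_float(comps, values):
    """Hypothetical helper: mimics comparing after a cast to float64."""
    lookup = {float(v) for v in values}
    return [float(c) in lookup for c in comps]


comps = [2**63 - 1, 5]   # values that would fit in int64
values = [2**63, 5]      # values that would need uint64

exact_result = isin_exact(comps, values)
float_result = isin_via_float(comps, values)

print(exact_result)  # [False, True]
print(float_result)  # [True, True]
```

Python integers have arbitrary precision, so the exact path never confuses neighbouring 64-bit values, at the cost of per-element object handling.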

mroeschke · 2025-10-07T16:31:13Z

pandas/core/algorithms.py

+            # If the dtypes differ and either side is unsigned integer,
+            # prefer object dtype to avoid unsafe upcast to float64 that
+            # can lose precision for large 64-bit integers.


Does this change the performance when values and comps are both integer like and fit within an integer type?

The fix takes the conservative approach of converting values to [object] when signed and unsigned integer dtypes are mixed, to ensure correctness. This preserves exact integer equality but may be slower for very large arrays than a numeric-only path.
The trade-off favors correctness over speed in the rare case of very large arrays with mixed signed/unsigned ints.

An alternative with better performance would be to remove the earlier asymmetric object-conversion block and add a fast, safe numeric path that correctly handles signed/unsigned mixes without converting to object.

The idea is to use masked uint64 lookups to avoid float casts and preserve performance. I’ll place this fast path after the comps_array extraction and before the common-type coercion: it maps signed int64 and uint64 values into the unsigned 64-bit space and performs hashtable lookups on uint64.
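A rough sketch of what a masked uint64 lookup could look like, written with the standard library only. This is not the code in the PR: as_uint64 and isin_masked are hypothetical names, struct reinterpretation stands in for a NumPy cast to uint64, and a Python set stands in for the hashtable. The mask matters because reinterpreting a negative int64 as uint64 wraps around silently instead of failing.

```python
import struct


def as_uint64(signed):
    """Reinterpret a signed 64-bit value as unsigned (two's complement)."""
    return struct.unpack("<Q", struct.pack("<q", signed))[0]


def isin_masked(signed_comps, unsigned_values):
    """Hypothetical sketch: look up int64 comps among uint64 values.

    Negative comps can never equal an unsigned value, so they are masked
    out (reported as False) before the reinterpreting lookup.
    """
    lookup = set(unsigned_values)
    return [c >= 0 and as_uint64(c) in lookup for c in signed_comps]


comps = [-1, 2**63 - 1, 7]
values = [2**64 - 1, 2**63, 7]

wrapped = as_uint64(-1)
masked_result = isin_masked(comps, values)

print(wrapped == 2**64 - 1)  # True: -1 wraps to the largest uint64
print(masked_result)         # [False, False, True]
```

Without the c >= 0 mask, -1 would wrongly match 2**64 - 1. The opposite direction (uint64 comps against int64 values) would need the same treatment applied to negative entries in values.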

This will involve changes roughly around algorithms.py+6-16. I’ll also run the new tests afterward to verify behavior. I'm not sure whether this breaks anything. Should I try?

WillAyd

Thanks for the ping. You'll also want to add a whatsnew note for this in v3.0.0.

WillAyd · 2025-10-15T18:38:55Z

pandas/core/algorithms.py

+        if (
+            values.dtype.kind in "iu"
+            and comps_array.dtype.kind in "iu"
+            # Only apply fast-path for 64-bit integer widths to avoid


Can you expand on this comment? I don't fully understand what the surprising behavior is, and I'm worried it's a red herring.

WillAyd · 2025-10-15T18:39:55Z

pandas/core/algorithms.py

+                        values_u = values.astype("uint64", copy=False)
+                        comps_u = comps_array.astype("uint64", copy=False)
+                        return htable.ismember(comps_u, values_u)
+            except Exception:


We almost never toss a base Exception in the code base. What is this trying to catch here?

BUG: avoid float upcast when mixing signed/unsigned ints in isin (tes…

7350469

…ts added)

mroeschke reviewed Oct 7, 2025

divya1974 added 4 commits October 8, 2025 12:31

Merge branch 'main' into fix/61676-isin-signed-unsigned

6dc52f6

performance improved approach

6b417e7

fix tests

b65a9f2

Merge branch 'main' into fix/61676-isin-signed-unsigned

13dcc7b

divya1974 requested a review from mroeschke October 15, 2025 04:20

divya1974 added 2 commits October 15, 2025 12:37

Clean up comments in test_isin_int64_vs_uint64_mismatch

3a7487d

Removed comments from the test case for isin method.

Refine comments on fast-path for integer itemsize

12a93fb

Clarified comments regarding fast-path application for integer widths.

WillAyd requested changes Oct 15, 2025
